An empirical study of multipass decoding for Vietnamese LVCSR
Abstract
In this paper, we present an empirical study of multipass decoding for Vietnamese LVCSR. We report our experiments with N-best, lattice, and consensus decoding on the VNBN data. Results from this study indicate that our acoustic model for Vietnamese was precise. These results can be investigated further to improve the performance of our system.
Index Terms: Vietnamese, Acoustic Model, Language Model, N-best, Word Lattice, Confusion Network.
Similar resources
Large vocabulary continuous speech recognition for Vietnamese, an under-resourced language
This paper proposes a method for building a Vietnamese Large Vocabulary Continuous Speech Recognition (LVCSR) system. The differences between Vietnamese and European languages are analyzed and used to adapt an LVCSR system for European languages to Vietnamese. Experiments are conducted on the VNSPEECHCORPUS. The results show that the accuracy of the Vietnamese recognition system is inc...
An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition
This paper presents an empirical study of word error minimization approaches for Mandarin large vocabulary continuous speech recognition (LVCSR). First, the minimum phone error (MPE) criterion, which is one of the most popular discriminative training criteria, is extensively investigated for both acoustic model training and adaptation in a Mandarin LVCSR system. Second, the word error minimizat...
Efficient Search Algorithms for Large Vocabulary Continuous Speech Recognition
Automatic speaker-independent speech recognition has made significant progress since the days of isolated word recognition. Today, state-of-the-art systems are capable of performing large-vocabulary continuous speech recognition (LVCSR) over complex domains such as news broadcasts and telephone conversations. A significant contribution to this advancement in technology is due to the development o...
First steps in building a large vocabulary continuous speech recognition system for Vietnamese
This paper presents an overview of our activities in building a Large Vocabulary Continuous Speech Recognition (LVCSR) system for Vietnamese, carried out at CLIPS-IMAG Laboratory (France) and International Research Center MICA (Vietnam). First, a new methodology for fast text corpus acquisition for minority languages, which has been applied to Vietnamese, is proposed. Second, the first resul...